Improving QED-Tutrix by Automating the Generation of Proofs
The idea of assisting teachers with technological tools is not new.
Mathematics in general, and geometry in particular, provide interesting challenges when developing educational software, from both the education and the computer science perspectives. QED-Tutrix is an intelligent tutor for geometry offering an interface that helps high school students solve proof problems. It focuses on specific goals: 1) to allow the student to freely explore the problem and its figure, 2) to accept proof elements in any order, 3) to handle a variety of proofs, which can be customized by the teacher, and 4) to be able to help the student at any step of the resolution of the problem, if the need arises. The software also operates independently of any teacher intervention. QED-Tutrix offers an interesting approach to geometry education, but is currently hampered by the lengthy process of implementing new problems, a task that must still be done manually. Therefore, one of the main focuses of the QED-Tutrix research team is to ease the implementation of new problems by automating the tedious step of finding all possible proofs for a given problem. This automation must follow fundamental constraints in order to create problems compatible with QED-Tutrix: 1) readability of the proofs, 2) accessibility at a high school level, and 3) the possibility for the teacher to modify the parameters defining the "acceptability" of a proof. In this paper, we present the results of our preliminary exploration of possible avenues for this task. Automated theorem proving in geometry is a widely studied subject, and various provers exist. However, our constraints are quite specific, and some adaptation would be required to use an existing prover. We have therefore implemented a prototype automated prover to suit our needs. The future goal is to compare performance and usability in our specific use case between the existing provers and our implementation.
Comment: In Proceedings ThEdu'17, arXiv:1803.0072
Evaluation and Improvement of DBpedia's Quality for Domain Knowledge Representation
RÉSUMÉ
The recent evolution of the Semantic Web, both in the quantity of information offered and in the multiplicity of its possible uses, makes it essential to evaluate the quality of the various available datasets. Since the Semantic Web is based on the RDF syntax, i.e. triples (for example ⟨subject, relation, object⟩), it can be seen as a huge graph, where a triple links a "subject" node and an "object" node by a "relation" edge. Each dataset thus represents a sub-graph. In this representation, DBpedia, one of the major Semantic Web datasets, is often considered its central node. Indeed, DBpedia ultimately aims to represent all the information present in Wikipedia, and therefore covers a very wide variety of subjects, making it possible to link to all the other datasets, including the most specialized ones. From this multiplicity of covered subjects arises a fundamental point of this project: the notion of "domain". Informally, we consider a domain to be a set of subjects connected by a common theme. For example, the Mathematics domain contains several subjects, such as algebra, function, or addition. Formally, we consider a domain to be a sub-graph of DBpedia in which only the nodes representing concepts related to that domain are kept.
At present, DBpedia's data extraction methods are generally much less effective when the subject is abstract and conceptual than when it is a named entity, for example a person, city, or company. Consequently, our first hypothesis is that the information available on DBpedia for a given domain is often poor, since our domains essentially consist of abstract concepts. The first stage of this research work provides an evaluation of the quality of the conceptual information of a set of 17 semi-randomly chosen domains, and confirms this hypothesis. To do so, we identify several axes for quantifying the "quality" of a domain: 1 - the number of inbound and outbound links for each concept, 2 - the number of links connecting two concepts of the domain relative to the number of links connecting the domain to the rest of DBpedia, 3 - the number of typed concepts (i.e. concepts representing an instance of a class; for example, Addition is an instance of the class Mathematical operation, so the concept Addition is typed if the corresponding relation appears in DBpedia). We reach the conclusion that the conceptual information contained in DBpedia is indeed incomplete, along all three axes.
The second part of this research work attempts to address the problem raised in the first part. To this end, we propose two possible approaches. The first provides potential classes, partly addressing the issue of the number of typed concepts. The second applies Open Relation Extraction (ORE) systems, which extract relations from text, to the abstract (i.e. the first paragraph of the Wikipedia page) of each concept. By classifying the extracted relations, this allows us 1) to propose novel relations between concepts of a domain, and 2) to propose potential classes, as in the first approach. These two approaches are, as they stand, only the beginning of a solution, but our preliminary results are very encouraging and indicate that they are undoubtedly relevant for helping to correct the problems demonstrated in the first part.
----------
ABSTRACT
In the current state of the semantic web, the quantity of available data and the multiplicity of its uses impose the continuous evaluation of the quality of this data, on the various Linked Open Data (LOD) datasets. These datasets are based on the RDF syntax, i.e. triples, such as ⟨subject, relation, object⟩. As a consequence, the LOD cloud can be represented as a huge graph, where every triple links the two nodes "subject" and "object" by an edge "relation". In this representation, each dataset is a sub-graph. DBpedia, one of the major datasets, is colloquially considered to be the central hub of this cloud. Indeed, the ultimate purpose of DBpedia is to provide all the information present in Wikipedia, "translated" into RDF, and it therefore covers a wide range of domains, allowing a linkage with every other LOD dataset, including the most specialized. From this wide coverage arises one of the fundamental concepts of this project: the notion of "domain". Informally, a domain is a set of subjects with a common theme. For instance, the domain Mathematics contains several subjects such as algebra, function or addition. More formally, a domain is a sub-graph of DBpedia, where the nodes represent domain-related concepts.
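The triple-and-graph representation described above can be sketched with a toy example; the triples below are illustrative placeholders, not actual DBpedia content:

```python
# Minimal sketch: an RDF-style dataset as a directed, edge-labelled graph.
# Each triple is (subject, relation, object); a dataset is a sub-graph.
triples = [
    ("Addition", "subClassOf", "Mathematical_operation"),  # illustrative data
    ("Algebra", "partOf", "Mathematics"),
    ("Function", "partOf", "Mathematics"),
]

def to_graph(triples):
    """Group triples into adjacency lists: subject -> [(relation, object)]."""
    graph = {}
    for s, r, o in triples:
        graph.setdefault(s, []).append((r, o))
    return graph

graph = to_graph(triples)
print(graph["Algebra"])  # [('partOf', 'Mathematics')]
```

A domain, in this view, is simply the sub-graph induced by the set of nodes belonging to that domain.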
Currently, the automatic extraction methods for DBpedia are usually far less efficient when the target subject is conceptual than when it is a named entity (such as a person, city or company). Hence our first hypothesis: the domain-related information available on DBpedia is often poor, since domains are constituted of concepts. In the first part of this research project, we confirm this hypothesis by evaluating the quality of domain-related knowledge in DBpedia for 17 domains chosen semi-randomly. This evaluation is based on three numerical aspects of the "quality" of a domain: 1 - number of inbound and outbound links for each concept, 2 - number of links between two domain concepts compared to the number of links between the domain and the rest of DBpedia, 3 - number of typed concepts (i.e. concepts representing the instance of a class: for example, Addition is an instance of the class Mathematical operation, so the concept Addition is typed if the corresponding relation appears in DBpedia). We reach the conclusion that the domain-related, conceptual information present in DBpedia is indeed poor along all three axes.
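The three quality measures can be caricatured on a toy set of triples; the domain membership and triples below are invented for illustration only:

```python
# Sketch of the three domain-quality measures on invented data.
triples = [
    ("Addition", "relatedTo", "Algebra"),                # intra-domain link
    ("Addition", "rdf:type", "Mathematical_operation"),  # typing link
    ("Algebra", "relatedTo", "History_of_science"),      # link out of the domain
]
domain = {"Addition", "Algebra", "Function"}

# 1 - inbound and outbound links for each domain concept
outbound = {c: sum(s == c for s, _, o in triples) for c in domain}
inbound = {c: sum(o == c for s, _, o in triples) for c in domain}

# 2 - links between two domain concepts vs links to the rest of DBpedia
intra = sum(s in domain and o in domain for s, _, o in triples)
external = len(triples) - intra

# 3 - typed concepts: subjects of an rdf:type triple
typed = {s for s, r, _ in triples if r == "rdf:type" and s in domain}

print(outbound["Addition"], intra, external, sorted(typed))
```

A domain would score as "poor" when most concepts have few links, few of those links stay inside the domain, and few concepts are typed.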
In the second half of this work, we give two solutions to the quality problem highlighted in the first half. The first one proposes potential classes that could be added to DBpedia, addressing the third quality aspect: the number of typed concepts. The second one uses an Open Relation Extraction (ORE) system to detect relations in text. By using this system on the abstract (i.e. the first paragraph of the Wikipedia page) of each concept, and classifying the extracted relations according to their semantic meaning, we can 1) propose novel relations between domain concepts, and 2) propose additional potential classes. These two methods currently represent only a first step, but the preliminary results we obtain are very encouraging, and seem to indicate that they are highly relevant to help correct the issues highlighted in the first part.
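The classification step of the second approach can be sketched as follows; the extracted tuples and the "is a" heuristic are hypothetical placeholders, not the actual ORE system or rules used in the work:

```python
# Sketch: sort Open-Relation-Extraction output from a Wikipedia abstract
# into candidate class assertions ("is a"-style relations) and candidate
# novel relations between concepts.  All data here is illustrative.
extracted = [
    ("Addition", "is a", "mathematical operation"),
    ("Addition", "is the inverse of", "subtraction"),
]

def classify(extractions):
    classes, relations = [], []
    for subj, rel, obj in extractions:
        if rel.strip().lower() in {"is", "is a", "is an"}:
            classes.append((subj, obj))          # candidate class for typing
        else:
            relations.append((subj, rel, obj))   # candidate new relation
    return classes, relations

classes, relations = classify(extracted)
print(classes)    # [('Addition', 'mathematical operation')]
print(relations)  # [('Addition', 'is the inverse of', 'subtraction')]
```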
Assessing and Improving Domain Knowledge Representation in DBpedia
With the development of knowledge graphs and the billions of triples generated on the Linked Data cloud, it is paramount to ensure the quality of the data. In this work, we focus on one of the central hubs of the Linked Data cloud, DBpedia. In particular, we assess the quality of DBpedia for domain knowledge representation. Our results show that DBpedia still has much room for improvement in this regard, especially for the description of concepts and their linkage with the DBpedia ontology. Based on this analysis, we leverage open relation extraction and the information already available in DBpedia to partly correct the issue, by providing novel relations extracted from Wikipedia abstracts and discovering entity types using the dbo:type predicate. Our results show that open relation extraction can indeed help enrich domain knowledge representation in DBpedia.
Expert Report on the ILS Equipment Qualification Procedure: Confidence Limits and Probabilities
The purpose of this study is to clarify certain passages of the report "DERA/WSS/WX1/CR 980799/2.3 ILS Certification Requirements". In particular, we study section 3.3, "Confidence Limits for Sequential Tests" (pp. 28-34), and section 2 of Appendix D (pp. 69-72).
Automating the Generation of High School Geometry Proofs using Prolog in an Educational Context
When working on intelligent tutor systems designed for mathematics education
and its specificities, an interesting objective is to provide relevant help to
the students by anticipating their next steps. This can only be done by
knowing, beforehand, the possible ways to solve a problem. Hence the need for
an automated theorem prover that provides proofs as they would be written by a
student. To achieve this objective, logic programming is a natural tool, due to the similarity of its reasoning to a mathematical proof by inference. In this
paper, we present the core ideas we used to implement such a prover, from its
encoding in Prolog to the generation of the complete set of proofs. However,
when dealing with educational aspects, there are many challenges to overcome.
We also present the main issues we encountered, as well as the chosen solutions.
Comment: In Proceedings ThEdu'19, arXiv:2002.1189
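The inference style the abstract alludes to can be illustrated language-agnostically; the paper's prover is written in Prolog, but the same saturation idea, applying "if all premises hold, the conclusion holds" rules until no new fact appears, can be sketched as forward chaining (the single geometry rule below is a simplified stand-in for a real rule base):

```python
# Minimal forward-chaining sketch: derive every consequence of a fact base,
# the way a logic-programming prover enumerates all reachable proof steps.
facts = {("parallel", "AB", "CD"), ("parallel", "CD", "EF")}

def apply_rules(facts):
    """One toy rule: parallelism is transitive (simplified for illustration)."""
    new = set()
    for rel1, a, b in facts:
        for rel2, c, d in facts:
            if rel1 == rel2 == "parallel" and b == c and a != d:
                new.add(("parallel", a, d))
    return new - facts

def saturate(facts):
    """Repeat rule application until no new fact is derivable."""
    while True:
        derived = apply_rules(facts)
        if not derived:
            return facts
        facts = facts | derived

print(("parallel", "AB", "EF") in saturate(facts))  # True
```

A real prover for this setting would carry proof traces along with each derived fact, so that every derivation can be replayed as a student-readable proof.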
Study of the ILS certification process. Confidence limits and probabilities.
The purpose of this study is to explain some parts of the DERA report DERA/WSS/WX1/CR 980799/2.3, "ILS Certification Requirements" (Ref. 2). We mainly study section 3.3, "Confidence Limits for Sequential Tests" (pp. 28-34), and section 2 of Appendix D (pp. 69-72).
Interleukin-13 Activates Distinct Cellular Pathways Leading to Ductular Reaction, Steatosis, and Fibrosis
Fibroproliferative diseases are driven by dysregulated tissue repair responses and are a major cause of morbidity and mortality, as they affect nearly every organ system. Type-2 cytokine responses are critically involved in tissue repair; however, the mechanisms that regulate beneficial regeneration versus pathological fibrosis are not well understood. Here, we have shown that the type-2 effector cytokine interleukin-13 simultaneously, yet independently, directed hepatic fibrosis and the compensatory proliferation of hepatocytes and biliary cells in progressive models of liver disease induced by interleukin-13 over-expression or following infection with Schistosoma mansoni. Using transgenic mice with interleukin-13 signaling genetically disrupted in hepatocytes, cholangiocytes, or resident tissue fibroblasts, we have revealed direct and distinct roles for interleukin-13 in fibrosis, steatosis, cholestasis, and ductular reaction. Together, these studies show that these mechanisms are simultaneously controlled but distinctly regulated by interleukin-13 signaling. Thus, it may be possible to promote interleukin-13-dependent hepatobiliary expansion without generating pathological fibrosis.
Automatic Generation of Proofs for a Geometry Tutoring Software
RÉSUMÉ: The QED-Tutrix software is an intelligent tutor that provides a framework helping high school students solve geometry problems. One of its fundamental characteristics is the tutoring aspect, which accompanies the student while solving the problem and helps them move forward when stuck, through messages personalized according to their progress in the proof. Another characteristic is that it stays close to the way a student would solve the problem with paper and pencil and, consequently, lets the student explore the construction of the proof in the order of their choosing. In other words, if the student's first instinct is to notice a feature of the geometric situation that is neither a direct consequence of the hypotheses nor an element directly needed for the conclusion, QED-Tutrix allows them to start solving the problem from that intermediate result. These two characteristics create a significant difficulty. Indeed, the software must provide help personalized to the student's progress, which may involve very varied proof elements with no necessary direct connection to one another, for example elements belonging to several distinct proof paths. Consequently, the software must know in advance the various possible resolution paths for each problem, and be able to analyze these paths to determine in which direction the student is heading. To address this issue, the software associates with each problem a data structure, named the HPDIC graph and presented in more detail in the central sections of this document, which represents in graph form all the possible proofs for a given problem.
It thus becomes possible to analyze precisely the student's progress in solving the problem, regardless of the order in which they provided the elements of the proof.
----------
ABSTRACT: The QED-Tutrix project has the ultimate goal of creating, improving, and expanding the eponymous tutoring software, to accompany high school students in the resolution of geometry problems. Based on proven mathematics education theories, it has several distinctive characteristics. First, as tutoring software, the tutoring aspect is paramount. The virtual tutor, aptly named Professor Turing, is tasked with helping the student through the entire resolution process, from the exploration of the problem to the writing of the proof. Second, the software as a whole stays close to the way students would typically solve a problem with paper and pencil, while adding digital tools to improve this process. This includes a dynamic geometry interface to explore the problem, a listing of available justifications, such as definitions or theorems, and a semi-automated proof-writing engine. Further, the student is allowed, as in real life, to construct the proof by adding elements in any order, for example by noticing a particular geometric result on the figure that is neither a hypothesis of the problem nor its conclusion. It is possible in QED-Tutrix to begin the proof "in the middle", by stating this particular result and working from there. These educational goals create interesting issues from a computer science point of view. Indeed, the tutoring is tailored to the exact point of the resolution reached by the student, and this point can be quite discontinuous, since the elements of the resolution may have been entered in any order, can be unrelated, or pertain to different possible proofs of the problem.
As a logical consequence, in order for the software to "know" what the student is doing and, most importantly, what he or she is missing to complete the resolution, it is necessary to know, beforehand, all the possible proofs for the problem.
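The underlying idea, matching the statements a student has entered, in any order, against the set of known proofs, can be illustrated with a deliberately simplified sketch; the actual HPDIC graph is a richer structure, and the statements below are invented:

```python
# Simplified illustration: each known proof of a problem is modelled as the
# set of statements it requires; students may enter statements in any order.
proof_paths = {
    "path_via_midsegment": {"M midpoint AB", "N midpoint AC", "MN parallel BC"},
    "path_via_similarity": {"triangle AMN ~ triangle ABC", "MN parallel BC"},
}

def closest_path(entered, proof_paths):
    """Return the proof path with the fewest missing statements,
    plus the statements the student still needs for it."""
    missing = {name: steps - entered for name, steps in proof_paths.items()}
    best = min(missing, key=lambda name: len(missing[name]))
    return best, missing[best]

entered = {"M midpoint AB", "N midpoint AC"}  # any order, possibly unrelated
best, todo = closest_path(entered, proof_paths)
print(best, todo)  # path_via_midsegment {'MN parallel BC'}
```

A tutor built on such a structure can then phrase its hints around the statements still missing from the proof path the student appears to be following.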